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Abstract.  In  this  paper  we  present  a  new  algorithm  for  3D  medical  im¬ 
age  segmentation.  The  algorithm  is  fast,  relatively  simple  to  implement, 
and  semi-automatic.  It  is  based  on  minimizing  a  global  energy  defined 
from  a  learned  non-parametric  estimation  of  the  statistics  of  the  region 
to  be  segmented.  Implementation  details  are  discussed  and  source  code 
is  freely  available  as  part  of  the  3D  Sheer  project.  In  addition,  a  new 
unified  set  of  validation  metrics  is  proposed.  Results  on  artificial  and 
real  MRI  images  show  that  the  algorithm  performs  well  on  large  brain 
structures  both  in  terms  of  accuracy  and  robustness  to  noise. 


1  Introduction 

The  problem  of  segmentation,  that  is  finding  regions  in  an  image  that  are  ho¬ 
mogeneous  in  a  certain  sense,  is  central  to  the  field  of  computer  vision.  Medical 
applications,  visualization  and  quantification  methods  for  computer-aided  diag¬ 
nosis  or  therapy  planning  from  various  modalities  typically  involve  the  segmen¬ 
tation  of  anatomical  structures  as  a  preliminary  step. 

In  this  paper  we  will  consider  the  problem  of  finding  the  boundaries  of  only  one 
anatomical  region  with  limited  user  interaction.  Interactivity  is  very  desirable 
since  the  user  will  be  given  the  opportunity  to  make  use  of  often  implicit  but 
absolutely  necessary  external  knowledge  to  guide  the  algorithm  towards  a  result 
that  would  make  sense  for  her  task.  The  segmentation  process  can  be  repeated 
in  order  to  identify  as  many  different  regions  as  necessary. 

Many  different  approaches  have  been  proposed  to  address  the  segmentation  prob¬ 
lem  which  can  be  dually  considered  as  finding  regions  or  finding  boundaries. 
Focusing  only  on  the  boundaries  is  less  complex  computationally  but  also  less 
robust  since  information  inside  the  region  is  discarded.  Typically  this  is  the 
approach  of  the  snake  and  active  contours  variational  methods  [7, 16, 17]. 

While  the  original  region-growing  algorithm  [11]  formalism  is  extremely  crude, 
interesting  extensions  have  been  proposed  in  [9]  where  some  statistical  informa¬ 
tion  is  derived  from  the  region  as  it  expands.  These  techniques  have  been  applied 
to  medical  image  analysis  [12,14].  The  relation  between  region-growing  and  ac¬ 
tive  contours  has  been  studied  in  [15]  and  more  recently  active  contours  have 
been  extended  to  an  elegant  active  regions  formalism  [8]  where  the  boundaries 
of  regions  are  deformed  according  to  an  evolution  equation  derived  to  minimize 
an  energy  based  on  some  statistics  of  the  regions. 
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2  Basic  Flow 


In  this  section,  we  state  the  fundamental  flow  underpinning  the  segmentation 
method.  Let  17  be  an  open  connected  bounded  subset  of  R"  with  smooth  bound¬ 
ary  dfl.  Let  if*  :  ft  — >  R"  be  a  family  of  embeddings,  such  that  ip°  is  the  iden¬ 
tity.  Let  (f>  :  Rn  — >  R  be  a  positive  C 1  function.  We  set  17(f)  :=  ^>*(17)  and 
S(t)  :=  We  consider  the  family  of  ^-weighted  volumes 


H(t)  :=  [  (j)(^t(x))dift(x)  =  f  (f>(y)dy. 

Jn 

Set  X  =  1 1— o  -  then  using  the  area  formula  [6]  and  then  the  divergence  theo¬ 

rem,  the  first  variation  is  ^  |t=o  =  / n  div(f>X)  dx.  =  —  Jgn((fX)  ■  Af  dy,  where 
AT  is  the  inward  unit  normal  to  <917.  Consequently  the  corresponding  ^weighted 
volume  minimizing  flow  is 


dS 

~dt 


(j)Ar. 


A  different  derivation  of  the  same  result  has  previously  been  proposed  in  [4] . 


3  Method 


In  what  follows  we  will  only  consider  the  3D  case.  A  region  R  will  be  a  subset 
of  R3  with  smooth  boundary  S  =  dR.  As  above,  A f  denotes  the  corresponding 
inward  unit  normal  vector  to  S. 

Given  an  image  I,  a  non-negative  weighting  function  w(-,  •)  and  a  region  R  we 
define  the  energy: 

E(I,w,R):=  f  w(  I(x),  ||  V/(x)  ||  )  dx.  (1) 

JR, 

E  is  the  weighted  volume  of  the  region  R.  The  weight  of  a  voxel  x_  is  determined 
by  the  function  w(-,  •)  of  the  local  properties  I(x)  and  ||  V/(x)  ||  of  the  image. 
Ideally,  w  should  reflect  the  local  properties  of  the  region  we  want  to  segment. 
As  this  is  not  known  a  priori  we  will  heuristically  estimate  w  as  we  evolve  R  to 
maximize  E. 

Proposition  1.  Notation  as  above.  Then  for  a  given  weighting  function  w,  the 
evolution  in  which  the  energy  E(I,w,R)  is  decreasing  as  fast  as  possible  (using 

dS 

only  local  information)  is  =  wAf. 

Proof.  Follows  immediately  from  the  discussion  in  Section  2.  □ 

Since  w  is  a  non-negative  function,  the  flow  is  reversible.  In  particular,  the  flow 
in  the  reverse  direction, 

~m  =  -U'M-  (2) 

gives  the  direction  in  which  the  energy  is  increasing  as  fast  as  possible  (using 
local  information).  In  the  context  of  segmentation,  one  may  think  of  (2)  as  a 
bubble  and  of  the  original  flow  as  a  snake. 

Given  an  approximation  Rq  of  the  region  to  be  segmented  we  can  use  a  maximum 
likelihood-like  approach  to  determine  the  weighting  function  w o  which  would  a 
posteriori  justify  the  segmentation  of  Rq. 

Proposition  2.  For  a  given  fixed  region  Rq,  the  energy  E(I ,  u>,  Rq)  is  maxi¬ 
mized  by  setting  w  to  p o  the  conditional  probability  on  that  region: 

wo  =  argmax  E(I,w,Rq)  =  Pr(  I(x),  ||  V/(x)  ||  |  x  £  Ro  ).  (3) 

p 


Proof.  We  can  rewrite  the  energy  as: 


E(I,p,R0 ) 


N} j0  (tt,  v)  .w(u,  v)  dudv, 


JiJ\\vi\\ 

where  Nr0(u,  v)  is  the  volume  of  the  set  of  points  xGRq  such  that  I(x)  =  u  and 
II  V/(x)  ||=  v.  But  this  is  just  a  constant  multiple  of  Pr(  I{x),  ||  V/(z)  ||  |  x  £ 
R0  )  which  by  the  Schwartz’s  inequality  is  the  maximizer  of  E.  □ 

As  the  region  evolves,  p  will  be  periodically  updated  according  to  (3).  This  will 
change  the  definition  of  the  energy  (1)  and  therefore  (2)  can  only  be  considered 
a  gradient  flow  for  every  time  interval  when  w  is  fixed. 


4  Implementation 

We  implemented  our  method  as  a  module  of  the  open-source  software  3D  Slicer. 
It  is  freely  available  for  download  at  http://www.slicer.org. 

4.1  Surface  evolution 


As  the  flow  (2)  is  unidirectional  (the  surface  can  only  expand  since  w  >  0)  any 
voxel  x  will  eventually  be  reached  at  a  time  T{x).  Knowing  T  is  equivalent  to 
knowing  R  or  S  since  by  construction: 

R(t)  =  {  x,  T(x)  <t  }  and  S(t)  =  dR(t).  (4) 

Solving  the  flow  (2)  for  S(t)  is  equivalent  to  solving  the  Eikonal  equation  (5)  for 

T(x): 

II  VTfe)  ||  -w(x)  =  1.  (5) 

This  can  be  done  very  efficiently  using  the  Fast  Marching  method  [3].  Starting 
from  known  seed  points  which  define  the  initial  surface,  the  algorithm  marches 
outwards  by  considering  neighboring  voxels  and  iteratively  computing  arrival 
times  T  in  increasing  order.  The  seed  points  are  set  by  the  user  inside  the 
structure  to  be  segmented.  By  construction,  when  computing  T(x ;),  the  surface 
contains  the  voxel  x  as  well  as  all  voxels  for  which  T  has  already  been  computed. 
The  algorithm  will  terminate  when  T  is  known  for  all  points  and  using  (4)  we 
will  know  S(t )  for  any  t.  We  will  then  let  the  user  determine  what  time  to  of  the 
evolution  corresponds  best  to  the  region  she  wants. 

Note  that  with  a  very  different  formalism,  our  method  is,  in  its  implementation, 
very  reminiscent  of  region  growing.  For  example,  the  min-heap  data  structure 
which  makes  Fast  Marching  efficient  is  the  direct  equivalent  of  the  sequentially 
sorted  list  in  the  seeded  region  growing  algorithm  [9] .  In  fact  our  algorithm  could 
be  made  a  direct  non-parametric  extension  of  seeded  region  growing  simply 
by  artificially  forcing  arrival  times  to  zero  for  all  points  inside  the  surface  S. 
Relations  between  region  growing  and  variational  schemes  have  been  previously 
exposed  in  [15]. 

4.2  Estimation  of  probability  density  function 

The  probability  has  been  modified  to  p  =  Pm{iti)  ■  pn(h)  where  M  and  H  are 
the  median  and  interquartile  range  (the  difference  of  between  the  first  and  last 
quartile)  operators  on  a  3  x  3  x  3  neighborhood.  M  and  H  convey  more  or  less 
the  same  information  as  I  (gray  level)  and  ||  V/  ||  (local  homogeneity)  but  their 
non-linear  nature  makes  them  more  robust  to  noise  and  allow  them  to  respect 
better  the  edges  of  the  image. 

We  use  Parzen  windows  [10]  to  estimate  the  probability  density  functions.  It  is  a 
non-parametric  technique  and  therefore  no  assumption  is  required  on  the  shape 


of  the  distributions.  Given  a  window  function  <f>  and  N  samples  mi, . . . ,  mjv  and 
hi, ...  ,h,N  the  densities  are  estimated  by: 

1  N  1  N 
Pm (m)  =  —  ^  <j)(m  -  mi)  and  pH(h)  =  —  ^  <j>(h  -  hi) 

i=  1  i=l 


5  Validation 


Objective  and  quantitative  analysis  of  performance  is  absolutely  crucial  (but 
often  overlooked)  when  proposing  a  segmentation  algorithm.  Since  designing  a 
segmentation  method  is  challenging  (lack  of  unifying  formalism,  high  diversity 
in  the  applications,  subjectivity,  implicitness,  etc.)  it  does  not  come  as  a  sur¬ 
prise  that  the  validation  of  such  an  algorithm  will  also  be  challenging.  Different 
methods  have  been  studied  (see  [20]  and  references  therein).  We  will  propose 
a  unifying  framework  for  discrepancy  measures  based  on  the  number  and  the 
position  of  mis-segmented  voxels  and  show  how  it  relates  to  classical  measures. 
We  will  then  apply  it  to  the  validation  of  segmentation  of  realistic  synthetic 
images  (for  which  the  “ground  truth”  i.e.  perfect  segmentation  is  known)  at  dif¬ 
ferent  levels  of  noise  for  accuracy  and  robustness  assessment  as  well  as  to  manual 
expert  segmentation  of  real  datasets. 


5.1  Classical  discrepancy  measures 


Different  measures  have  been  proposed  to  assess  the  resemblance  between  a  pro¬ 
posed  segmentation  S  and  the  corresponding  ground  truth  G.  The  Dice  Similar¬ 
ity  Coefficient  has  been  widely  used  and  it  can  be  derived  as  an  approximation 
of  the  kappa  statistic  (see  [1]).  It  is  defined  as: 


DSC(S,  G)  := 


V(S  n  G) 


K V(S)+V(G)) 

Where  V (•)  is  the  volume  (number  of  voxels)  of  a  set. 

One  disadvantage  of  this  coefficient  is  that  it  only  takes  into  account  the  number 
of  mis-segmented  pixels  and  disregards  their  position  and  therefore  the  severities 
of  errors.  This  was  corrected  in  Yasnoff’s  normalized  discrepancy  measure  (ND, 
see  [18])  and  the  Factor  of  Merit  (FOM,  see  [5]): 


ND:=vY',(i)S 

i= 1 


and  FOM.  :=  — 


N 


N  ^  1 
2—1 


d(i? 


Where  N  is  the  number  of  mis-segmented  voxels  and  d(i)  is  the  error  on  the  ith 
voxel.  Another  popular  measure  is  the  Hausdorff  distance: 


H(S,  G)  :=  max{  max  min  ||  s  -  g  ||,  max  min  ||  s  -  g  ||  } 

g€G  g€G  sGo 

H(S,G)  is  the  maximum  distance  we  would  have  to  move  the  boundaries  of  one 
set  so  that  it  would  encompass  completely  the  other  set.  As  this  is  extremely 
sensitive  to  extreme  errors,  the  partial  Hausdorff  distance  Hf(S ,  G)  can  be  intro¬ 
duced  (see  [2])  as  the  maximum  distance  we  would  have  to  move  the  boundaries 
of  one  set  so  that  it  would  cover  f%  of  the  other  set. 


5.2  Proposed  framework 

Consider  now  the  error-distance: 

for  x  correctly  segmented  (x  €  S  fl  G) 
x  —  s  |j  for  x  under-segmented  (x  G  G\S) 

x  —  g  |  for  x  over-segmented  ( x  G  S\G) 

Assuming  that  all  points  xGSUG  are  equally  likely  d  can  be  seen  as  a  random 
variable  D  which  describes  completely  the  discrepancy  between  S  and  G.  We 
can  study  D  using  the  standard  statistical  tools: 

probability  of  error:  PE  :=  Pr(D  >  0) 

mean  error:  hd>o  '■=  mean (D  \  D  >  0) 

standard  deviation  of  error:  ctd>o  :=  stdev(D  |  D  >  0) 

partial  distance-error:  Df  :=  /  —  quantile(D) 

These  measures  receive  a  natural  intuitive  interpretation. 

—  PE  is  the  probability  for  a  voxel  x  G  S  ft  G  to  be  misclassified  (either  over- 
or  under-segmented) . 

—  An  erroneous  voxel  is  on  average  pd> o  pixels  off.  This  value  is  or  is  not 
typical  depending  on  the  standard  deviation  <td>o- 

—  D\-f  is  the  error  distance  of  the  worst  /%  voxels  or  equivalently  the  max¬ 
imum  distance  we  would  need  to  move  erroneous  voxels  for  the  error  to  be 
improved  to  PE  =  /. 

As  an  example,  PE  =  10%,  po> o  =  3.1,  on> o  =  0.3  and  Ho. 99  =  14  would 
mean  that  the  overlap  between  the  ground  truth  and  the  proposed  segmentation 
is  90%.  The  10%  remaning  pixels  are  either  under-segmented  or  over-segmented 
pixels  (“false  positive”  i.e.  pixels  that  are  in  S  and  not  in  G).  On  average  these 
pixels  are  3.1  pixels  off.  This  value  is  very  typical  since  the  standard  variation  is 
low  (0.3).  However  there  is  no  reason  for  the  error  to  be  Gaussian  and,  here,  the 
tail  probability  is  not  negligible  since  the  worst  1%  pixels  are  at  least  14  pixels 
off.  This  could  be  due  to  a  thin,  long  finger  of  mis-segmented  pixels. 

The  following  proposition  justifies  the  definition  of  these  new  unified  measures. 


d(x )  := 


mm 

sGS 

min 

.  g&G 


Proposition  3. 

5.1  according  to: 


These  measures  are  related  to  the  measures  presented  in  Section 


1  -  DSC  <PE  =  (1  —  DSC) /{l 


DSC 
2  ^ 


(6) 


1 

FOMe 


1  <(h2D>  0  +  °£>>o)  —  ND 


Hl-f/(l-PE)  <Dl-f  < 


(7) 

(8) 


Proof,  in  future  publication  (*n  Particular>  Di  H )  g 


5.3  Results  on  simulated  datasets 


The  publicly  available  Brain  Web  [19]  datasets  have  been  generated  from  a 
known  ground  truth  using  a  sophisticated  physical  modeling  [13]  of  the  MRI  pro¬ 
cess.  We  can  assess  in  a  perfectly  objective  way  the  performance  of  our  method 
by  comparing  the  result  of  our  segmentation  with  the  underlying  ground  truth. 
Note  that  even  though  these  datasets  are  computer-generated  they  are  very  re¬ 
alistic  (see  figure  1(b))  Another  interesting  aspect  of  this  project  is  that  from 


the  same  ground  truth,  datasets  with  different  levels  of  noise  can  be  simulated 
which  allows  us  to  study  the  robustness  of  our  method  with  respect  to  noise.  We 
segmented  the  lateral  ventricle,  white  matter  (WM)  and  white  matter  and  gray 
matter  (WM+GM)  on  2  datasets: 

—  Normal  brain,  Tl,  lxlxl  mm  (181  x  181  x  217  voxels),  3%  noise,  20% 
intensity  non-uniformity  (”RF”)  (standard  parameters  of  the  Brain  Web 
model). 

—  Normal  brain,  Tl,  lxlxl  mm  (181  x  181  x  217  voxels),  9%,  40%  (highest 
levels  of  noise  available). 

Our  results  (Table  1)  show  that  the  proposed  algorithm  gives  very  good  results 
on  these  structures  ( DSC  >  0.7  has  been  described  as  a  good  agreement  in 
the  literature,  see  for  example  [1]).  The  complex  structure  of  the  white  matter 
makes  it  more  challenging  and  explains  the  somewhat  mediocre  performance 
(in  the  case  of  the  maximum  noise  dataset,  the  cerebellum  was  not  perfectly 
segmented).  In  the  highest  level  of  noise,  connectivity  between  the  lateral  and 
the  third  ventricles  was  lost  (the  intraventricular  foramen  of  Monro  disappeared 
in  the  noise).  This  increased  the  strength  of  the  ventricle  edges  in  the  noisy 
dataset  and,  paradoxically,  simplified  the  segmentation.  Overall  the  algorithm 
appears  extremely  robust  to  noise. 


DSC 

PE 

MO  0 

j  0\D>  0 

Do. 95 

Do. 99 

Ventricle 

92.0%  95.1% 

14.9%  9.4% 

1.07  1+3 

ol 

00 

d 

1.00  +00 

1.00  L41 

WM 

91.9%  80.3% 

15.0%  32.0% 

1.59  2.03 

1.58  1.94 

1.00  2.83 

3.61  8.25 

WM+GM 

96.2%  95.2% 

7.4%  9.2% 

1.42  1.40 

1.25  1.15 

1.00  1.00 

1.41  2.00 

Table  1.  Performance  measure  on  artificial  dataset.  Left  bold,  with  standard  noise, 
right,  with  maximum  noise.  Underlined  results  are  illustrated  by  figures  1(b), 1(d), 1(f). 

5.4  Results  on  real  datasets 

In  this  real  case,  the  pathological  diagnoses  are  meningiomas  (brain  tumor). 
Patients’  heads  were  imaged  in  the  sagittal  and  axial  plane  with  a  1.5  T  MRI 
system3  with  a  postcontrast  3D  sagittal  spoiled  gradient  recalled  (SPGR)  acqui¬ 
sition  with  contiguous  slices.  The  resolution  is  0.975  x  0.975  x  1.5  mm  (256  x 
256  x  124  voxels).  These  datasets  were  manually  segmented  by  one  expert. 
Because  of  inter-  and  intra-expert  variability  we  should  expect  these  results  not 
to  be  as  good  as  in  the  synthetic  case.  It  should  also  be  noted  that  the  arbitrary 
conventions  of  the  manual  segmentations  are  responsible  for  a  lot  of  the  observed 
error  since  for  example  the  ventricle  was  labeled  as  gray  matter,  the  medulla 
oblongata  and  the  spinal  cord  have  been  left  out  etc.  (compare  Fig.  1(a)  and 
1(c)).  Overall,  nonetheless,  results  are  consistent  with  the  artificial  case. 


DSC 

PE 

Moo 

coo 

Do.  95 

Do. 99 

Tumor 

78.0%  88.0% 

36.0%  21.4% 

1.97  1.34 

1.63  0.94 

3.32  1.41 

7.00  2.83 

WM+GM 

96.1%  92.4% 

7.5%  14.2% 

1.69  1.28 

1.99  0.75 

1.00  1.00 

2.00  2.24 

Table  2.  Performance  measure  on  2  real  datasets.  Underlined  results  are  illustrated 
by  figures  1(a),  1(c),  1(e). 
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(a)  Sagittal  slice  of  real 
dataset  and  proposed 
white  and  gray  matter 
segmentation  (white) 


(c)  Expert  segmentation 
(gray)  and  proposed  white 
and  gray  matter  segmen¬ 
tation  (white) 


(e)  Rendered  surface  of 
proposed  white  and  gray 
matter  segmentation 


(b)  Axial  slice  of  noisy  ar¬ 
tificial  dataset  and  pro¬ 
posed  ventricle  segmenta¬ 
tion  (white) 


(d)  Underlying  ground 
truth  (gray)  and  pro¬ 
posed  ventricle  segmenta¬ 
tion  (white) 


(f)  Rendered  surface  of 
proposed  ventricle  seg¬ 
mentation 


6  Conclusion 


We  presented  a  new  curve  evolution  flow  based  on  learned  non-parametric  statis¬ 
tics  of  the  image.  Implementation  is  straightforward  and  efficient  using  the  Fast 
Marching  algorithm  and  is  freely  available  as  part  of  the  3D  Slicer  project.  An 
extensive  validation  study  as  well  as  a  new  unified  set  of  validation  metrics  have 
also  been  proposed. 

Future  work  will  focus  on  extending  our  formalism  into  a  purely  variational 
framework,  adding  some  regularizing  constraints  and  extending  the  validation 
study. 
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